<img src="hugo.gif" alt="hugo> <english>This is Hugo.</english> <suomi>Tämä on Hugo.</suomi>and as result, you want two documents: an English one
<img src="hugo.gif" alt="hugo> This is Hugo.and a Finnish one
<img src="hugo.gif" alt="hugo> Tämä on Hugo.This can easily be achieved by defined two macro set, one begin stored as english.hsc
<$macro english /close><$content></$english> <$macro suomi /close></$suomi>and another one stored as suomi.hsc
<$macro english /close></$english> <$macro suomi /close><$content></$suomi>
The first one defines two container macros: <english> simply inserts the whole content passed to it every time, while <suomi> always discards any content enclosed in it.
If you now process your source using

    hsc english.hsc hugo.hsc to en-hugo.html

it will look like the first output document described above. To gain a result looking like the second one, you only have to use

    hsc suomi.hsc hugo.hsc to fi-hugo.html
This is simply because the macros declared in suomi.hsc work just the other way round from those in english.hsc: everything enclosed in <english> will be ignored, and everything being part of <suomi> remains.
This version of hsc officially supports only Latin-1 as its input character set. The exact definition of that is a bit messy, but basically it refers to most of those 255 characters you can input with your Amiga-Keyboard.
For this character set, all functions described herein should work, especially the CLI-option RPLCENT.
Although Latin-1 is widely used within most decadent western countries, it does not provide all characters some people might need, for instance those from China and Japan, whose writing systems work completely differently.
As the trivial idea of Latin-1 was to use 8 bits instead of the rotten 7 bits of ASCII (note that the ``A'' in ASCII stands for American), the trivial idea of popular encodings like JIS, Shift-JIS or EUC is to use 8 to 24 bits to encode one character.
Now what does hsc say if you feed such a document to it?
As long as you do not specify RPLCENT, it should work without much bother. However, you will need a w3-browser that can also display these encodings, and some fiddling with <META> and related tags.
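For instance, a minimal sketch of such a declaration might look like this (the exact charset name depends on which encoding you actually use, so treat it as an assumption):

    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=Shift_JIS">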
If you think you are funny and enable RPLCENT, hsc will still not mind your input. But with great pleasure it will cut all your nice multi-byte characters into decadent western 8-bit ``cripplets'' (note the pun). And your browser will display loads of funny western characters - but not a single funny Japanese one.
Recently an old western approach to these encoding problems has gained popularity: Unicode - that's the name of the beast - was created as some waste product of the Taligent project around 1990 or so, as far as I recall.
Initially created as an unpopular gadget not supported by anything, it is now on everybody's lips, because Java, the language-of-hype, several MS-DOS based operating systems and now - finally - the rotten hypertext language-of-hype support it. At least to some limited extent. (Technical note: usually you only read of UCS-2 instead of UCS-4 in all those specifications, and maybe some vague proposals to use UTF-16 later.)
As hsc is written in the rotten C-language (an American product, by the way), it cannot cope with zero-bytes in its input data, and therefore is unable to read data encoded in UCS-4, UTF-16 or (gag, puke, retch) UCS-2; it will simply stop after the first zero in the input.
Because the rotten C-language is so widely used, there are some zero-byte work-around formats for Unicode, most notably UTF-8 and UTF-7. These work together with hsc, although with the same limitations you have to care about when using the eastern encodings mentioned earlier.
Note that it needs at least five encodings to make Unicode work with most software - again in alphabetical order: UCS-2, UCS-4, UTF-16, UTF-7 and UTF-8. I wonder what the "Uni" stands for...
Anyway, as a conclusion: you can use several extended character sets, but you must not enable RPLCENT.

Recently, html-4.0 was released, and it sucks surprisingly less (as far as "sucks less" is applicable to html at all). Of course there currently is no browser capable of displaying all these things, but nevertheless you can use hsc to author for it - with some limitations. This section will shortly outline how.
As already mentioned, html now supports those extended character encodings. See above for how to deal with input files using such an encoding, and which ones to avoid.
If your system does not allow you to input funny characters (for
instance one can easily spend ATS 500.000 on a Workstation just for
being absolutely unable to enter a simple ``ä''), you can use
numeric entities, both in their decimal or hexadecimal representation:
for example, to insert a Greek Alpha, you can use &#913; or &#x391;, and hsc will accept both. However, you still cannot define entities beyond the 8-bit range using <$defent>.
Some highlights are that the ALT attribute of <IMG> is now required and that there are now loads of ``URIs'' instead of ``URLs'' around. Nothing new for old hsc-users... he he he.
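So, reusing the hugo.gif example from earlier, a sketch of an <IMG> that keeps both html-4.0 and hsc happy:

    <IMG SRC="hugo.gif" ALT="hugo">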
Another interesting thing is that the DTD now contains some meta-information that was not part of earlier DTDs, so it may make sense to use the DTD as a base for hsc.prefs.
Therefore, you can download some shabby ARexx-scripts from the support-w3-page for hsc that try to do exactly that task. The problem is that they are extremely lousy, and the sgml-inventors might suggest the death penalty for my pseudo-sgml-parser, a pathetic ``masterpiece'' of reverse engineering. That's why you cannot find it on Aminet or anywhere else.
But most likely you will not need this at all, as I have already included the output of these scripts in the main archive. Look for hsc-html-40.prefs and use it instead of hsc.prefs. Remember to create a backup of your old hsc.prefs beforehand.
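For instance, from an AmigaOS shell, something along these lines would do; the filenames are taken from above, but the paths naturally depend on where your prefs actually live:

    copy hsc.prefs hsc.prefs.bak
    copy hsc-html-40.prefs hsc.prefs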
Viewing hsc-html-40.prefs, you will stumble across a currently nowhere documented tag called <$varlist>. It was introduced to make it easier to refer to groups of often used attributes by one name. For example, you can use

    <$varlist coreattrs id:id class:string style:string title:string>

to declare such a group with the name coreattrs, consisting of the attributes id, class, style and title. Later on, in the declaration of some tag, you can write this name enclosed in square brackets instead of listing all these attributes, like

    <$deftag A .. [coreattrs] ..>

instead of

    <$deftag A .. id:id class:string ..>

Although this also works within macro declarations, it is officially undocumented and its use outside hsc.prefs is therefore discouraged.
The main reason for these scripts: they should point out how this principle could work. The current implementation is nearly useless, but maybe someone will write a reasonable version. Definitely not me, as html-1.0 plus tables is all I will ever need. It's up to you, all you self-satisfied, passive and degenerated users.
As you can now optionally read this manual in a Postscript version, there might be some interest in how it was done.
The rudimentarily bearable application used for conversion is (very originally) called html2ps and can be obtained from http://www.tdb.uu.se/~jan/html2ps.html. As common with such tools, "it started out as a small hack" and "what really needs to be done is a complete rewriting of the code", but "it is quite unlikely that this [...] will take place". The usual standard disclaimer of every public Perl-script. All quotes taken from the manual to html2ps.
Basically the html- and the Postscript-version contain the same words. However, there are still some differences, for example the printed version does not need the toolbar for navigation provided at the top of every html-page.
Therefore, I wrote two macros, <html-only>
and
<postscript-only>
. The principle works exactly like the one
described for <english>
and <suomi>
earlier in this
chapter, and you can find them in docs-source/inc/html.hsc and docs-source/inc/ps.hsc.
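I will not reproduce those files here, but by analogy with the english.hsc/suomi.hsc example above, the html.hsc one would contain something like this (a sketch only; the real files may differ in detail):

    <$macro html-only /close><$content></$html-only>
    <$macro postscript-only /close></$postscript-only>

with ps.hsc working just the other way round.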
However, there is a small difference to the multi-lingual examples, as I do not really want to create two versions all the time. Instead, I prefer to create either a fully hypertext-featured version or a crippled Postscript-prepared html-document in the same location.
You can inspect docs-source/Makefile to see how this is done: if make is invoked without any special options, the hypertext version is created. But if you instead use make PS=1 and therefore define a symbol named PS, the pattern rule responsible for creating the html-documents acts differently and produces a reduced, Postscript-prepared document without the toolbar:
$(DESTDIR)%.html : %.hsc
ifdef PS
	@$(HSC) inc/ps.hsc $(HSCFLAGS) $<
else
	@$(HSC) inc/html.hsc $(HSCFLAGS) $<
endif
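So, invoked from docs-source, the two flavours are simply built with

    make
    make PS=1

where the first call produces the hypertext version and the second the Postscript-prepared one.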
Needless to say, the conditional in the Makefile does not work with every make - I used GNUmake for that; your make-tool may have a slightly different syntax.
For my convenience, there are two rules called rebuild
and rebuild_ps
with their meanings being obvious: they
rebuild the whole manual in the desired flavour.
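I make no claim that this matches the actual Makefile, but such convenience rules could look roughly like the following sketch, which assumes a default target named all and a clean rule (both names are assumptions here):

rebuild :
	$(MAKE) clean
	$(MAKE) all

rebuild_ps :
	$(MAKE) clean
	$(MAKE) PS=1 all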
So after a successful make rebuild_ps
, everything only
waits for html2ps. Maybe you want to have a look at the
docs-source/html2ps.config used, although it is
straightforward and does not contain anything special. This should not
need any further comments, as there is a quite useful manual supplied
with it.
However, making html2ps work with an Amiga deserves some remarks. As you might already have guessed, you will need the Perl-archives of GG/ADE - no comments on that, everybody interested should know what and where GG is.
I suppose you can try the full Unix-alike approach with hsc compiled for AmigaOS/ixemul and GG more or less taking over your machine, and therefore invoke perl directly. This will require a rule like

ps :
	html2ps -W l -f html2ps.config -o ../../hsc.ps ../docs/index.html
As I am a dedicated hater of this, I used the AmigaOS-binary, a SAS-compiled GNUmake and the standard CLI. A usually quite successful way to make such things work is with the help of ksh, which, for your confusion, is in an archive at GG called something like pdksh-xxx.tgz (for ``Public Domain ksh''). Invoking ksh with no arguments will start a whole shell-session (ugh!), but you can use the switch -c to pass a single command to be executed. After that, ksh will automatically exit, and you are back in your cosy CLI, just as if nothing evil had happened seconds before.
So finally the rule to convert all those html-files into one huge Postscript file on my machine is:
ps : ksh -c "perl /bin/html2ps -W l -f html2ps.config -o ../../hsc.ps ../docs/index.html"
Note that html2ps is smart enough to follow those
(normally invisible) <LINK REL="next" ..>
tags being part of
the html-documents, so only the first file is provided as argument,
and it will automatically convert the other ones.
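Just to illustrate what html2ps is following here: such a relation sits in the head of each document and looks roughly like this (the filename is made up for this sketch):

    <LINK REL="next" HREF="next-page.html">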
Well, at least you see it can be done.